Method for run-length encoding of a bitmap data stream

Field of the invention 

This invention relates to a method for encoding a data
stream, particularly a bitmap coded subtitling data stream.

Background

Broadcast or read-only media containing video data may also
comprise subpicture data streams, containing textual or
graphical information needed to provide subtitles, glyphs
or animation for any particular purpose, e.g. menu buttons.
Since displaying of such information may usually be enabled
or disabled, it is overlaid on the associated video image
as an additional layer, and is implemented as one or more
rectangular areas called regions. Such a region has a specified
set of attributes, e.g. area size, area position or
background color. Due to the region being overlaid on the
video image, its background is often defined to be
transparent so that the video image can be seen, or
multiple subpicture layers can be overlaid. Further, a
subtitle region may be broader than the associated image,
so that only a portion of the subtitle region is visible,
and the visible portion of the region is shifted e.g. from
right to left through the whole subtitle area, which looks
as if the subtitles would shift through the display. This
method of pixel based subtitling is described in the
European patent application EP02025474.4 and is called
cropping.

Subtitles were originally meant as a support for
handicapped people, or to save the costs for translating a
film into rarely used languages, and therefore for pure
subtitle text it would be enough if the subtitle data
stream contained e.g. ASCII coded characters. But subtitles
today also contain other elements, up to high-resolution
images, glyphs or animated graphical objects. Handling of
such elements is easier if the subtitling stream is coded
in bitmap format, with the lines of an area and the pixels
within a line being coded and decoded successively. This
format contains much redundancy, e.g. when successive
pixels have the same color value. This redundancy can be
reduced by various coding methods, e.g. run-length encoding
(RLE). RLE is often used when sequences of data have the
same value, and its basic ideas are to code the sequence
length and the value separately, and to code the most
frequent code words as short as possible.
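
The basic idea of coding the sequence length and the value separately can be illustrated with a minimal sketch. The function name `runs` and the sample line are assumptions made for illustration only; color value 0 is taken here to stand for a transparent pixel.

```python
from itertools import groupby


def runs(pixels):
    """Split one line of pixel values into (value, length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(pixels)]


# A hypothetical line: four transparent pixels, three pixels of color 7,
# one pixel of color 9, two transparent pixels.
line = [0, 0, 0, 0, 7, 7, 7, 9, 0, 0]
print(runs(line))
```

Each pair can then be mapped to a code word; how short that code word is for each kind of run is the subject of the method described below.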

Particularly when encoding the subtitle layer for 1920x1280
pixels high-definition video (HDTV), a coding algorithm
that is optimized for this purpose is needed to reduce the
required amount of data.



Summary of the Invention 


The purpose of the invention is to disclose a method for 
optimized encoding of subtitle or subpicture layers for 
high-resolution video, such as HDTV, being represented as 
bitmap formatted areas that may be much broader than the 
visible video frame.

This method is disclosed in claim 1. An apparatus for 
encoding that utilizes the method is disclosed in claim 7. 
An apparatus for decoding that utilizes the method is
disclosed in claim 8.

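According to the invention, four-stage run-length encoding
(RLE) is used for this purpose, with the shortest code words
being used for single pixels having individual color
values other than transparent, the second shortest code
words being used for shorter sequences of transparent
pixels, the third shortest code words being used for longer
sequences of transparent pixels and for shorter sequences of
pixels of equal color other than transparent, and the fourth
shortest code words being used for longer sequences of pixels
of equal color other than transparent. Usually, most of the
pixels within the subtitle layer are transparent. Other than
for conventional RLE, where the most frequent data use the
shortest code words, this method uses the second shortest
code words for short sequences of the most frequent color,
and the third shortest code words for longer sequences of the
most frequent color and also for short sequences of other
colors. The shortest code words are reserved for single
pixels of colors other than the most frequent color. This is
advantageous when pixels of the most frequent color almost
always appear in sequences, as is the case for transparent
pixels in the subtitle layer.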

Advantageously, a code according to the invention contains
redundant code words, since a sequence that is ideally coded
with a shorter code word may also be defined among the longer
code words. E.g. a single pixel of any color other than
transparent is ideally coded with a code word of the shortest
type, but a code word of the third shortest type may be used
as well, with the sequence length being one. Though the
latter possibility will usually not be used for this purpose,
these unused code words, or gaps in the code word space, can
be used for transportation of other information. An example
is the end-of-line information that can be used for
resynchronization. According to the invention, the shortest
redundant code word is used to code this information.

As another advantage, the disclosed method reduces the 
amount of required data, thus compressing the subtitle data 
stream, with the compression factor depending on the 
contents of the data stream. Particularly high compression
factors are achieved for data combinations that appear very
often in typical subtitling streams. These are sequences of
length shorter than e.g. 64 pixels that have the same color
value, but also sequences of transparent pixels having any
length and single pixels having individual color values.
The first of these groups is often used in characters or
glyphs, the second of these groups is used before, between
and after the displayed elements of the subtitling stream,
and the third of these groups is used in images, or areas
with slightly changing color. Since transparent pixels
hardly ever appear in very short sequences, e.g. less than
three pixels, it is sufficient to code them not with the
shortest but only with the second shortest code words.

Simultaneously, the inventive method may efficiently handle
sequences that are longer than 1920 pixels, and e.g. may be
up to 16383 pixels long, thus enabling very wide subtitling
areas.



Further, the coding method generates a unique value
representing the end of a line, and therefore in the case
of loss of synchronization it is possible to resynchronize
each line.

Advantageously, the inventive method is optimized for
coding the combination of a number of features being
typical for subtitling streams.

Therefore the amount of data required for the subtitling
stream may be reduced, which leads to better utilisation of
transmission bandwidth in the case of broadcast, or to a
reduced pick-up jump frequency in the case of storage media
where a single pick-up reads multiple data streams, like
e.g. in Blu-ray disc (BD) technology. Further, the better
the subtitling bitmap is compressed, the higher capacity in
terms of bit-rate will be left for audio and video streams,
increasing picture or audio quality.

Advantageous embodiments of the invention are disclosed in
the dependent claims, the following description and the figures.


Brief description of the drawings

Exemplary embodiments of the invention are described with
reference to the accompanying drawings, which show in

Fig. 1 cropping of a subtitle area in a video frame;
Fig. 2 a pixel sequence in a subtitle area;
Fig. 3 a coding table for subtitling, including text and
graphics; and
Fig. 4 a table with an exemplary syntax of an extended
object data segment for the Blu-ray Prerecorded standard.

Detailed description of the invention

While subtitling in pre-produced audio-visual (AV) material
for broadcast or movie discs is primarily optimized for
representing simple static textual information, e.g. Closed
Caption, Teletext or DVB-Subtitle, progress in multimedia
development for presentation and animation of textual and
graphic information adequate to new HDTV formats requires
an advanced adaptation for bitmap encoding. Fig. 1 shows a
video frame TV and a subtitle area SUB containing text and
graphical elements G, with the subtitle area SUB being
bitmap coded. The size of the subtitle area SUB may exceed
the video frame dimensions, as e.g. for the Blu-ray Disc
Prerecorded (BDP) format subtitle bitmaps are allowed for
one dimension to be larger than the video frame. Then the
lines are cropped before being displayed, i.e. a portion
matching the respective frame dimension is cut out of the
virtual line and displayed, overlaying the video image. In
Fig. 1, the subtitle area SUB of width B_SUB is cropped, so
that only a portion of width B_TV is visible. For standard
HDTV, as used e.g. for BDP, B_TV is 1920 pixels, while B_SUB
may be much more.
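
The cropping step can be sketched as a simple slice over one decoded line. The function name `crop_line` and the parameter `offset` (the horizontal position of the visible window inside the subtitle area) are assumptions for illustration; the text above only states that a portion of width B_TV is cut out of a line of width B_SUB.

```python
def crop_line(line, offset, frame_width=1920):
    """Return the visible portion (width B_TV) of one decoded subtitle line (width B_SUB)."""
    if offset < 0 or offset + frame_width > len(line):
        raise ValueError("visible window lies outside the subtitle line")
    return line[offset:offset + frame_width]


# A hypothetical subtitle line that is wider than the video frame.
subtitle_line = [0] * 5000
visible = crop_line(subtitle_line, offset=1200)
print(len(visible))
```

Changing `offset` from frame to frame would produce the shifting effect described in the background section.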

Due to the rectangular shape of the subtitle area SUB, most
pixels in that area are transparent. This is shown in an
enlarged scale in Fig. 2, in a simplified manner since usually
a line SL1,SL2 on an HDTV screen TV must be several pixels
wide in order to be clearly visible. A line is herein
understood as a horizontal structure. Each line of subtitle
data usually contains one or more pixel sequences of equal
color. Fig. 2 shows a part of a subtitle line SL1 containing
transparent sequences, but also single visible pixels and
shorter visible lines PS2. Since subtitling lines usually
begin and end with transparent sections, each line contains
one more transparent than colored section. But transparent
sections are usually longer, while for pixels of a color
other than transparent, used e.g. in characters, the most
frequent case is a sequence length of 64 or less.

This can be recognized from a rough estimation, assuming
that 25 characters are displayed simultaneously, and that
the space between characters has about one quarter the
width of a character, so that a single character may use
not more than 1920/25 · (8/10) ≈ 62 pixels within a line.

Often, a line SL2 contains only very few visible pixels, and
therefore only few transparent sequences that are very
long.

A code being a preferred embodiment of the invention is
listed in Fig. 3. It is a run-length code comprising code
words of lengths ranging from one byte up to four bytes.
Using eight bits per byte, it is capable of coding 256 different
colors, with one preferred color. The preferred color is in
this example 'transparent', but may be any other color if
adequate. A color look-up table (CLUT) may transform the
decoded color values into the actual display color.
The run-lengths are coded in two
ranges, with the shorter range being up to 63 pixels and
the longer range being up to 16383 pixels.

The shortest code words of 1 byte length are used to code a
single pixel having any individual color other than the
preferred color, which is here transparent. The color value
CCCCCCCC may range from 1 up to 255, and may represent a
color directly or indirectly. E.g. it may represent an
entry in a color look-up table (CLUT) that contains the
actual color code. One of the 8-bit values, containing only
zeros (00000000), serves as an escape sequence, indicating
that the following bits have to be considered as part of
the same code word. In that case, the code word tree has
four possible branches, marked by the two following bits.

In the first branch, indicated by the following bits being
00b, valid code words have two bytes, and a shorter sequence
of up to 63 pixels is coded having the preferred color,
e.g. transparent. The only invalid code word in this branch
is the one that comprises only 0's, since 0 represents no
valid sequence length. This code word '00000000 00000000'
may be used for other purposes. According to the invention,
it is used to indicate the end of a line since it is the
shortest redundant code word.

In the second branch, indicated by the following bits being
01b, the code word comprises another byte, and the fourteen
L bits are used to code the length of a pixel sequence of
the preferred color, e.g. transparent. Thus, the sequence
length may be up to 2^14 - 1 = 16383. The code words where the
L bits have a value below 64 are redundant, and may be used
for other purposes.

In the third branch, indicated by the following bits being
10b, the code words comprise an additional byte, and the
six L bits of the second byte represent the length of a
shorter sequence of up to 63 pixels, which have a color other
than the preferred color. The actual color is directly or
indirectly represented by the CCCCCCCC value of the third
byte. The code words with a sequence length LLLLLL below
three are redundant, since a sequence of one or two pixels
of this color can be coded cheaper using one byte per
pixel as described above, and a sequence length of zero is
invalid. These code words may be used for other purposes.

In the fourth branch, indicated by the following bits being
11b, the code words comprise two additional bytes, wherein
the remaining six bits of the second byte and the third
byte give the length of a longer sequence of 64 up to 16383
pixels, and the color value CCCCCCCC of the fourth byte
gives the color, directly or indirectly and not being the
preferred color. The code words with a sequence length
below 64 are redundant, since these sequences may be coded
cheaper using the third branch. These code words may be
used for other purposes.

The redundant code words mentioned above may be used to 
extend the code, e.g. add internal check sums or other 
information. 
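
The four branches described above can be summarized in a minimal encoder sketch. It is an illustration under stated assumptions, not the implementation of the invention: the names `encode_run` and `encode_line` are hypothetical, color value 0 is assumed to denote the preferred (transparent) color, and the fourteen L bits are assumed to be stored most significant bits first. Runs longer than 16383 pixels are split into several code words.

```python
from itertools import groupby

PREFERRED = 0                  # assumption: value 0 is the preferred (transparent) color
END_OF_LINE = bytes([0x00, 0x00])
MAX_RUN = 16383                # 2**14 - 1, longest run a single code word can carry


def encode_run(color, length):
    """Encode one run with the shortest valid code word of the table."""
    if not 0 <= color <= 255 or not 1 <= length <= MAX_RUN:
        raise ValueError("color or run length out of range")
    if color == PREFERRED:
        if length <= 63:       # 00000000 00LLLLLL
            return bytes([0x00, length])
        # 00000000 01LLLLLL LLLLLLLL
        return bytes([0x00, 0x40 | (length >> 8), length & 0xFF])
    if length <= 2:            # CCCCCCCC, one byte per pixel
        return bytes([color]) * length
    if length <= 63:           # 00000000 10LLLLLL CCCCCCCC
        return bytes([0x00, 0x80 | length, color])
    # 00000000 11LLLLLL LLLLLLLL CCCCCCCC
    return bytes([0x00, 0xC0 | (length >> 8), length & 0xFF, color])


def encode_line(pixels):
    """Encode one line of pixel values and terminate it with the end-of-line code word."""
    out = bytearray()
    for color, group in groupby(pixels):
        remaining = len(list(group))
        while remaining > 0:   # split runs that exceed one code word
            chunk = min(remaining, MAX_RUN)
            out += encode_run(color, chunk)
            remaining -= chunk
    out += END_OF_LINE
    return bytes(out)


line = [0] * 5 + [7] + [9] * 3 + [0] * 100
print(encode_line(line).hex(" "))
```

The sketch never emits the redundant code words, so they stay free for the other purposes mentioned above.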

The extended run-length encoding table shown in Fig. 3 and
described above provides mainly two advantages. First, it
allows for the most compact encoding of typical subtitle
streams, including transparent areas, small graphical
objects and normal subtitle text. Single pixels of any
color, as used for small colorful graphics, are coded with
a single byte. The dominant color, e.g. transparent for BDP
subtitling, is always encoded together with a run-length.
Run-length codes are available in two different sizes, or
two pixel quantities. In a first step, run-lengths of up to
63 pixels are available as 2-byte code words for the
dominant color, and as 3-byte code words for the other
colors. In a second step, run-lengths of up to 16383 pixels
are available as 3-byte code words for the dominant color
and as 4-byte code words for the other colors. The end-of-
pixel-string code, or end-of-line code, is a unique 2-byte
code word that can be used for resynchronization. Secondly,
the availability of longer sequences for the subtitling
area, up to 16383 pixels per code word, means a reduction
of redundancy, and therefore of the amount of data. This
means that for applications with separate data streams
sharing one channel, e.g. multiple data streams on an
optical storage medium sharing the same pick-up, bigger
portions of the subtitling stream may be loaded with the
same amount of data, thus reducing the access frequency
for the subtitle stream.
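
A decoder for the code word sizes summarized above can be sketched as follows. The function name `decode_line` is hypothetical, color value 0 is assumed to denote the dominant (transparent) color, and the fourteen L bits are assumed to be stored most significant bits first. Redundant code words are simply decoded as ordinary runs in this sketch; a real decoder might assign them other meanings.

```python
def decode_line(data, pos=0):
    """Decode code words from data[pos:] up to the end-of-line code word.

    Returns the decoded pixel values and the position of the next line.
    """
    pixels = []
    while pos < len(data):
        first = data[pos]
        if first != 0:                      # CCCCCCCC: single pixel
            pixels.append(first)
            pos += 1
            continue
        if pos + 1 >= len(data):
            raise ValueError("truncated code word")
        second = data[pos + 1]
        branch, low = second >> 6, second & 0x3F
        size = (2, 3, 3, 4)[branch]         # code word length in bytes per branch
        if pos + size > len(data):
            raise ValueError("truncated code word")
        if branch == 0:
            if low == 0:                    # 00000000 00000000: end of line
                return pixels, pos + size
            pixels.extend([0] * low)
        elif branch == 1:                   # long run of the dominant color
            pixels.extend([0] * ((low << 8) | data[pos + 2]))
        elif branch == 2:                   # short run of another color
            pixels.extend([data[pos + 2]] * low)
        else:                               # long run of another color
            pixels.extend([data[pos + 3]] * ((low << 8) | data[pos + 2]))
        pos += size
    raise ValueError("missing end-of-line code word")


stream = bytes.fromhex("0005070083090040640000" "00c12c070000")
first_line, next_pos = decode_line(stream)
second_line, _ = decode_line(stream, next_pos)
print(len(first_line), len(second_line))
```

Because every line ends with the same 2-byte code word, the returned position marks where the next line starts, which is the property used for line-wise resynchronization.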

Another aspect of the invention is a further optimization
of the data stream for transport using transport packets,
e.g. in a packetized elementary stream (PES). Due to the
large file size of bitmaps, the packaging of such data,
e.g. in object data segments (ODS), is a problem. Often the
maximum size of an ODS is limited by other factors, e.g.
PES packet size. To fit large bitmaps into such packets, it
would be necessary to cut bitmaps into small bitmap pieces
before coding, which reduces the compression efficiency. To
overcome this bitmap splitting, a new extended object data
segment (ExODS) for BDP or comparable applications is
disclosed, as shown in Fig. 4. ExODS is a data structure
representing each of the fragments into which an ODS is cut
for fitting it into a sequence of limited size segments and
packets. The complete ODS can be reconstructed by
concatenating the sequence of individual pieces of
consecutive ExODSs.

The start and the end of a sequence of ExODSs are indicated
by flags, first_in_sequence and last_in_sequence.
When the first_in_sequence flag is 1, a new sequence is
starting. An ExODS having set the first_in_sequence flag to
1 also indicates the size of the decompressed bitmap, by
containing its dimensions object_width and object_height.
The advantage of indicating the bitmap dimensions is the support
of target memory allocation before the decompression
starts. Another advantage is that the indicated bitmap
dimensions can also be used during decoding for cross
checking bitmap dimensions. When the last_in_sequence flag
is set to 1, the last ExODS of a complete ODS is indicated.
There may be ExODSs having set neither the first_in_sequence
nor the last_in_sequence flag. These are ExODS pieces in
the middle of a sequence. Also the case of having set both
the first_in_sequence flag and the last_in_sequence flag
is possible, if the ODS can be carried within a single
ExODS.

To overcome the limitation in size available for a
single ODS by PES packet size within subtitling, the
described type of ExODS may be introduced as a container
for pieces of one ODS, e.g. for packaging large ODSs for
HDTV applications. Besides the ODS pieces, the ExODS also
carries flags indicating if it is carrying the first piece,
the last piece, a middle piece or the one and only complete
piece of an ExODS sequence. Furthermore, if the first piece
in sequence of the ExODS is transmitted, the dimensions of
the resulting ODS, i.e. height and width of the encoded
bitmap, are contained in the segment. The indicated bitmap
dimensions can also be used for a decoding cross check.
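
The fragmentation and reassembly of an ODS can be sketched as follows. The in-memory class `ExODS` and the functions `split_ods` and `join_exods` are hypothetical and do not reproduce the byte syntax of Fig. 4; only the flag and dimension names are taken from the description above. The parameter `max_payload` stands for whatever size limit the transport imposes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExODS:
    first_in_sequence: bool
    last_in_sequence: bool
    payload: bytes
    object_width: Optional[int] = None    # only carried by the first piece
    object_height: Optional[int] = None   # only carried by the first piece


def split_ods(ods, width, height, max_payload):
    """Cut one encoded ODS into a sequence of ExODS pieces."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    pieces = [ods[i:i + max_payload] for i in range(0, len(ods), max_payload)] or [b""]
    segments = []
    for index, piece in enumerate(pieces):
        first = index == 0
        last = index == len(pieces) - 1
        segments.append(ExODS(first, last, piece,
                              width if first else None,
                              height if first else None))
    return segments


def join_exods(segments):
    """Reconstruct the complete ODS by concatenating consecutive pieces."""
    if (not segments or not segments[0].first_in_sequence
            or not segments[-1].last_in_sequence):
        raise ValueError("incomplete ExODS sequence")
    return b"".join(segment.payload for segment in segments)


ods = bytes(range(10))
segments = split_ods(ods, width=1920, height=32, max_payload=4)
print(len(segments), join_exods(segments) == ods)
```

Since the bitmap is coded before it is cut, the compression efficiency does not depend on the segment size.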



The inventive method can be used for compression of bitmap
data streams containing e.g. text, images or graphics data
for animation, menus, navigation, logos, advertisement,
messaging or others, in applications such as e.g. Blu-ray
Prerecorded (BDP) discs or generally high-definition video
(HDTV) recordings or broadcast.

Claims

1. A method for run-length encoding of a data stream, 
the data stream comprising bitmap formatted subtitle 
or menu data for video application, wherein a 
preferred color is defined, and the subtitle or menu 
data are displayed as a separate layer, overlaying 
other displayed data, and wherein 

- the shortest type of code words is used for single
pixels having individual color values, the color
not being the preferred color;

- the second shortest type of code words is used for 
shorter sequences of pixels of the preferred 
color; 

- the third shortest type of code words is used for 
longer sequences of pixels of the preferred color, 
wherein the sequence length may exceed the width 
of the video display, and shorter sequences of 
pixels of other color; and 

- the fourth shortest type of code words is used for 
longer sequences of pixels of individual color, 
wherein the sequence length may exceed the width 
of the video display. 

2. Method according to claim 1, wherein the preferred 
color is transparent. 

3. Method according to any of the previous claims,
wherein redundant code words are used to code
information not referring to pixels of the subtitle
layer.

4. Method according to any of the previous claims,
wherein the shortest redundant code word is used for
line synchronisation.






5. Method according to any of the previous claims, 
wherein the shortest type of code words is one byte 
long, the shorter sequences have lengths up to 63 
and the longer sequences have lengths up to 16383.

6. Method according to any of the previous claims, 
wherein the encoded data stream is distributed over 
multiple transport packets. 

7. An apparatus for run-length encoding of a data
stream, the data stream comprising bitmap formatted
subtitle or menu data for video application, wherein
a preferred color is defined, comprising

- means for encoding single pixels having individual
color other than the preferred color, using the
shortest type of code words;

- means for encoding shorter sequences of pixels of
the preferred color, using the second shortest
type of code words;

- means for encoding longer sequences of pixels of
the preferred color, wherein the sequence length
may exceed the width of the video display, and
shorter sequences of pixels of equal color other
than the preferred color, using the third shortest
type of code words; and

- means for encoding longer sequences of pixels of 
equal color other than the preferred color, 
wherein the sequence length may exceed the width 
of the video display, using the fourth shortest 
type of code words.

8. An apparatus for run-length decoding of an encoded
data stream containing compressed bitmap formatted
subtitle or menu data for video application, wherein 
a preferred color is defined, comprising 

- means for decoding single pixels having individual 
color other than the preferred color, using the 
shortest type of code words; 

- means for decoding shorter sequences of pixels of 
the preferred color, using the second shortest 
type of code words; 

- means for decoding longer sequences of pixels of 
the preferred color, wherein the sequence length
may exceed the width of the video display, and
shorter sequences of pixels of equal color other
than the preferred color, using the third shortest 
type of code words; and 

- means for decoding longer sequences of pixels of 
equal color other than the preferred color, 
wherein the sequence length may exceed the width 
of the video display, using the fourth shortest 
type of code words. 

9. Apparatus according to claim 7 or 8, wherein the 
preferred color is transparent. 



10. Apparatus according to any of claims 7-9, further
comprising means for encoding or decoding code words 
that are used to transmit information not referring 
to pixels of the subtitle layer. 

11. Apparatus according to any of claims 7-10,
wherein said encoded data stream is distributed over
multiple transport packets.

Subtitling aims at the presentation of text information and
graphical data, encoded as pixel bitmaps. The size of
subtitle bitmaps may exceed video frame dimensions, so
that only portions are displayed at a time. The bitmaps are a
separate layer lying above the video, e.g. for synchronized
video subtitles, animations and navigation menus, and
therefore contain many transparent pixels. An advanced
adaptation for bitmap encoding for HDTV, e.g. 1920x1280
pixels per frame as defined for the Blu-ray Disc
Prerecorded format, providing optimized compression results
for such subtitling bitmaps, is achieved by a four-stage
run-length encoding. Shorter or longer sequences of pixels
of a preferred color, e.g. transparent, are encoded using
the second or third shortest code words, while single
pixels of different color are encoded using the shortest
code words, and sequences of pixels of equal color use the
third or fourth shortest code words.



Fig. 3 

Code                                  Meaning

CCCCCCCC                              one pixel in color C (1 ≤ C ≤ 255)
00000000 00LLLLLL                     L pixels in color 0 (1 ≤ L ≤ 63)
00000000 01LLLLLL LLLLLLLL            L pixels in color 0 (64 ≤ L ≤ 16383)
00000000 10LLLLLL CCCCCCCC            L pixels in color C (3 ≤ L ≤ 63, 1 ≤ C ≤ 255)
00000000 11LLLLLL LLLLLLLL CCCCCCCC   L pixels in color C (64 ≤ L ≤ 16383, 1 ≤ C ≤ 255)
00000000 00000000                     end of line

00000000 10000000 X                   possible extensions
00000000 10000001 X
00000000 10000010 X